ABSTRACT


We  consider  infinite  dimensional  optimization  problems  involving 
entropy-type  functionals  in  the  objective  function  as  well  as  in  the 
constraints.  A  duality  theory  is  developed  for  such  problems  and 
applied  to  the  reliability  rate  function  problem  in  Information 
Theory. 


Key Words: Optimization in Infinite Dimensional Spaces;
Duality in Convex Optimization; Entropy; Divergence; Information
Theory; Channel Capacity; Reliability Rate Function; Error
Exponent Function.


1.  INTRODUCTION 


Extremum problems involving entropy-type functionals appear in
a diversity of applications. To mention just a few: statistical
estimation and hypothesis testing (Kullback-Leibler (Ref. 1), Kullback
(Ref. 2), Akaike (Ref. 3)), traffic engineering (Charnes et al. (Ref. 4)),
marketing (Charnes et al. (Ref. 5)), accounting (Charnes and Cooper
(Refs. 6,7)), information theory (Shannon (Ref. 8)).

In the majority of these applications, the extremum problems
involved are studied only for the case of finite distributions. Extensions
to arbitrary distributions were derived recently by Ben-Tal and Charnes
(Ref. 9). The extremum problem is set up as an infinite dimensional
convex program with linear equality constraints, namely:


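$$\text{(A)} \quad \inf_{f \in D} \Big\{ \int_T f(t) \log \frac{f(t)}{g(t)}\, dt \;:\; \int_T f(t)\, a_i(t)\, dt = \theta_i, \quad i = 1, \dots, m \Big\},$$

where D is the convex subset of density functions with support T,
and g(·) is a given density in D.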

It is shown in Ref. 9 that the dual problem is the unconstrained
finite dimensional concave program:

$$\text{(B)} \quad \sup_{y \in \mathbb{R}^m} \Big\{ y^t \theta - \log \int_T g(t)\, e^{\sum_{i=1}^m y_i a_i(t)}\, dt \Big\}.$$
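As a sanity check on the dual pair (A)-(B), the following minimal sketch works through a discrete analogue in which the integral over T is replaced by a finite sum. The support, the uniform reference density, the statistic a(t) = t and the moment value 1.5 are illustrative assumptions, not taken from the paper, and the bisection routine is only one convenient way to maximize the concave dual when m = 1.

```python
import math

# Discrete analogue of (A)-(B): T = {0,...,4}, one moment constraint (m = 1).
T = [0, 1, 2, 3, 4]
g = [0.2] * 5                      # reference density g (assumed uniform)
a = [float(t) for t in T]          # assumed constraint function a_1(t) = t
theta = 1.5                        # assumed moment, strictly inside (min a, max a)

def log_partition(y):
    """log of sum_t g(t) * exp(y * a(t))."""
    return math.log(sum(gt * math.exp(y * at) for gt, at in zip(g, a)))

def tilted(y):
    """Exponentially tilted density f_y(t) = g(t) exp(y a(t)) / Z(y)."""
    w = [gt * math.exp(y * at) for gt, at in zip(g, a)]
    z = sum(w)
    return [wt / z for wt in w]

def dual_objective(y):
    """Objective of the discrete version of (B)."""
    return y * theta - log_partition(y)

def solve_dual(lo=-50.0, hi=50.0, iters=200):
    """Bisection on the mean of a under f_y, which is increasing in y."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        mean = sum(ft * at for ft, at in zip(tilted(mid), a))
        if mean < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

y_star = solve_dual()
f_star = tilted(y_star)
primal_value = sum(ft * math.log(ft / gt) for ft, gt in zip(f_star, g))
dual_value = dual_objective(y_star)
moment = sum(ft * at for ft, at in zip(f_star, a))
```

Under these assumptions the tilted density satisfies the moment constraint and its relative entropy with respect to g equals the dual optimum, which is the zero duality gap stated for (A)-(B).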


The dual pair (A)-(B) has a very interesting statistical
interpretation: let $\{\theta_i\}$ be parameters of the distribution,
estimated in terms of a sample $x = (x_1, \dots, x_n)$ by

$$\hat\theta_i(x) = \hat\theta_i(x_1, \dots, x_n) = \frac{1}{n} \big( a_i(x_1) + \dots + a_i(x_n) \big),$$

and let these estimates replace $\theta_i$ in the constraints of (A).
Consider now the problem of finding the maximum likelihood estimator
$\pi^*(x)$ of the parameter vector $\pi = (\pi_1, \dots, \pi_m)^t$ in the exponential
family generated by the (fixed) density g(t), i.e.:

$$f(t|\pi) = g(t)\, c(\pi)\, e^{\sum_{i=1}^m \pi_i a_i(t)},$$

where $c(\pi)$ is a normalizing constant, i.e.:

$$c(\pi)^{-1} = \int_T g(t)\, e^{\sum_{i=1}^m \pi_i a_i(t)}\, dt.$$

The likelihood function is

$$\prod_{j=1}^n f(x_j|\pi) = \Big\{ \prod_{j=1}^n g(x_j) \Big\}\, c(\pi)^n\, e^{\sum_{j=1}^n \sum_{i=1}^m \pi_i a_i(x_j)},$$

hence

$$\frac{1}{n} \log(\text{likelihood}) = \text{const} + \sum_{i=1}^m \pi_i \hat\theta_i(x) + \log c(\pi);$$

therefore the maximum likelihood estimator $\pi^*(x)$ is obtained by solving:

$$\max_{\pi \in \mathbb{R}^m} \Big\{ \sum_{i=1}^m \pi_i \hat\theta_i(x) + \log c(\pi) \Big\} = \max_{\pi \in \mathbb{R}^m} \Big\{ \sum_{i=1}^m \pi_i \hat\theta_i(x) - \log \int_T g(t)\, e^{\sum_{i=1}^m \pi_i a_i(t)}\, dt \Big\}.$$

The latter is precisely the dual problem (B). Thus, for the
exponential family, Statistical Information theory and the Maximum
Likelihood approach are dual principles.
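The identity behind this conclusion, that the average log-likelihood equals a constant plus the objective of (B) evaluated at the sample moments, can be checked numerically. The sketch below does so for a small discrete exponential family; the support, the density g, the statistic a(t) = t and the sample are hypothetical choices made only for illustration.

```python
import math

T = [0, 1, 2, 3]
g = [0.4, 0.3, 0.2, 0.1]           # assumed generating density g
a = [0.0, 1.0, 2.0, 3.0]           # assumed single statistic a_1(t) = t (m = 1)
sample = [0, 0, 1, 3, 2, 1, 0, 1]  # hypothetical observations x_1..x_n

def log_c(pi):
    """log c(pi), where 1/c(pi) = sum_t g(t) exp(pi * a(t))."""
    return -math.log(sum(gt * math.exp(pi * at) for gt, at in zip(g, a)))

def avg_log_likelihood(pi):
    """(1/n) sum_j log f(x_j | pi) with f(t|pi) = g(t) c(pi) exp(pi * a(t))."""
    return sum(math.log(g[x]) + log_c(pi) + pi * a[x] for x in sample) / len(sample)

# Sample moment that replaces theta in the constraints of (A).
theta_hat = sum(a[x] for x in sample) / len(sample)
# The part of the average log-likelihood that does not depend on pi.
const = sum(math.log(g[x]) for x in sample) / len(sample)

def dual_objective(pi):
    """Objective of (B) with theta replaced by theta_hat (log c = -log integral)."""
    return pi * theta_hat + log_c(pi)

# The difference should vanish for every parameter value.
gaps = [avg_log_likelihood(pi) - dual_objective(pi) - const
        for pi in (-1.0, 0.0, 0.7, 2.5)]
```

Since the two functions differ only by a constant, they share the same maximizer, which is the sense in which maximum likelihood and the dual problem coincide here.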

Many problems in information theory, however, cannot be
stated with linear constraints alone as in problem (A); they also
contain (nonlinear) entropy-type inequality constraints. It is the purpose
of this paper to derive duality results for such problems and to
demonstrate their power and elegance in treating them.
As a motivation we begin by describing the channel capacity
problem of Information Theory. Consider a communication channel
described by an input alphabet A = {1,...,n}, an output alphabet
B = {1,...,m} and by a probability transition matrix
Q = {Q(k|j)}, where Q(k|j) is the probability of receiving the
output letter k ∈ B when input letter j ∈ A was transmitted.

The capacity of the channel is defined as:

$$C = \max_{p \in \mathbb{P}^n} I(p,Q) = \max_{p \in \mathbb{P}^n} \sum_{k=1}^m \sum_{j=1}^n p_j Q(k|j) \log \frac{Q(k|j)}{\sum_{l=1}^n p_l Q(k|l)} \tag{1}$$

where

$$\mathbb{P}^n = \Big\{ p \in \mathbb{R}^n : p_j \ge 0 \ \ \forall j; \ \sum_{j=1}^n p_j = 1 \Big\} \tag{2}$$

is the set of all probability distributions on the channel input,
and I(p,Q) is known as the average mutual information between
the channel input and channel output. Channel capacity is the basic
concept of Shannon's mathematical theory of communication (later
called Information Theory). For more details on the notion of
capacity and its significance, the reader is referred to Shannon
(Ref. 8), Gallager (Ref. 10), Jelinek (Ref. 11).

Roughly speaking, the basic theorem of information theory, the
so-called "noisy channel coding theorem", states that if the channel
has capacity C, it is possible to transmit over this channel
messages of sufficiently large length at rate R < C and still
be able to decode them with a probability of error as small as
desired. An upper bound on the probability of error is given in terms
of an exponentially decreasing function of the so-called reliability
rate function E(R). In the classical proof of the coding theorem,
the function E(R) is derived via a sequence of mathematical
manipulations, see e.g., Gallager (Ref. 12) and Csiszar (Ref. 13).
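To make the capacity definition (1) concrete, here is a minimal sketch that evaluates the average mutual information I(p,Q) for a binary symmetric channel and maximizes it over a grid of input distributions. The crossover probability 0.1, the grid resolution and the use of natural logarithms (nats) are assumptions chosen for illustration; the closed form log 2 - H(eps) for this channel is a standard fact, not a result of the paper.

```python
import math

def mutual_information(p, Q):
    """I(p,Q) in nats; Q[j][k] stores Q(k|j), the probability of output k given input j."""
    n, m = len(Q), len(Q[0])
    # Output distribution induced by the input distribution p.
    out = [sum(p[j] * Q[j][k] for j in range(n)) for k in range(m)]
    total = 0.0
    for j in range(n):
        for k in range(m):
            if p[j] > 0 and Q[j][k] > 0:      # convention 0 log 0 = 0
                total += p[j] * Q[j][k] * math.log(Q[j][k] / out[k])
    return total

eps = 0.1                                      # assumed crossover probability
bsc = [[1 - eps, eps], [eps, 1 - eps]]

# Crude maximization over input distributions p = (t, 1 - t) on a grid.
grid = [i / 1000 for i in range(1001)]
best_p0 = max(grid, key=lambda t: mutual_information([t, 1 - t], bsc))
capacity = mutual_information([best_p0, 1 - best_p0], bsc)

# Known closed form for the binary symmetric channel: log 2 - H(eps).
closed_form = math.log(2) + eps * math.log(eps) + (1 - eps) * math.log(1 - eps)
```

For this symmetric channel the grid search lands on the uniform input distribution, and the resulting value agrees with the closed form.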

Blahut (Ref. 14) has shed light on many basic problems of coding theory
by defining E(R) as a saddle function problem involving the Kullback-
Leibler relative entropy functional; namely, for a given channel
matrix P(k|j):

$$E(R) = \max_{p \in \mathbb{P}^n} \min_{Q \in \mathcal{Q}(R)} \sum_{k=1}^m \sum_{j=1}^n p_j Q(k|j) \log \frac{Q(k|j)}{P(k|j)} \tag{3}$$

where

$$\mathcal{Q}(R) = \{ Q : I(p,Q) \le R \}, \quad R \text{ a positive scalar}.$$

Starting from this definition, Blahut (Ref. 14) proved that
E(R) can be expressed by the conventional parametric form originally
proposed by Gallager (Ref. 12), namely,

$$E(R) = \max_{\delta \ge 0} \max_{p \in \mathbb{P}^n} \Big\{ -\delta R - \log \sum_{k=1}^m \Big( \sum_{j=1}^n p_j P(k|j)^{\frac{1}{1+\delta}} \Big)^{1+\delta} \Big\} \tag{4}$$

A  new  proof  of  this  result  is  given  here  in  Section  3,  via  the  duality 
theory  developed  in  Section  2.  The  duality  framework  can  be  applied 
to  a  variety  of  other  extremum  problems  of  information  theory,  (see 
e.g.,  Blahut  (Ref.  14),  Table  I,  p.  417). 

In particular, more than one entropy-type constraint can be
easily dealt with, and the general (not necessarily discrete)
distribution case can be considered.


2.  DUALITY  THEORY  FOR  LINEAR  AND  ENTROPY  CONSTRAINED  PROGRAMS 

Let dt be a σ-finite additive measure defined on a σ-field
of the subsets of a measurable space T, and let $L^1 = L^1(T,dt)$ be
the usual Lebesgue space of measurable real valued functions x on
T so that

$$\|x\| = \int_T |x(t)|\, dt < \infty.$$

Let $\mathbb{D} = \{ x \in L^1 : x(t) \ge 0 \ \text{(a.e.)}, \ \int_T x(t)\, dt = 1 \}$ be the convex subset
of $L^1$ which is the set of all probability densities x(·) on T.
Consider the infinite dimensional optimization problem:


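$$\text{(P)} \quad \inf_{x \in \mathbb{D}} \int_T x(t) \log \frac{x(t)}{c_0(t)}\, dt$$

subject to

$$\int_T a_i(t)\, x(t)\, dt \ge b_i, \quad i \in I = \{1, \dots, m\} \tag{5}$$

$$\int_T x(t) \log \frac{x(t)}{c_k(t)}\, dt \le e_k, \quad k \in K = \{1, \dots, p\} \tag{6}$$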

where $c_k : T \to \mathbb{R}$, $k \in \{0\} \cup K$, are given summable positive functions;
$a_i : T \to \mathbb{R}$ are given continuous functions; and $\{b_i\}_{i \in I}$, $\{e_k\}_{k \in K}$
are given real numbers.

Here and henceforth, $0 \log 0 = \lim_{t \to 0^+} t \log t = 0$. A dual
representation of problem (P) will be derived via Lagrangian duality.
Recall that for a convex optimization problem:

$$\text{(A)} \quad \inf \{ f(x) : g(x) \le 0, \ x \in C \subset X \},$$

where $f : C \to \mathbb{R}$, $g : C \to \mathbb{R}^m$ are convex functions defined on a convex
subset C of a linear space X, the Lagrangian for problem (A) is
defined as $L : C \times \mathbb{R}^m_+ \to \mathbb{R}$ given by:

$$L(x,y) = f(x) + y^t g(x).$$

The dual objective function is

$$h(y) = \inf_{x \in C} L(x,y)$$

and then the dual problem (B) associated with (A) is defined as:

$$\text{(B)} \quad \sup_{y \ge 0} h(y).$$

The main result concerning the dual pair (A) and (B) is the existence
of a saddle point (x*,y*) for (A) or, equivalently, the validity of
a strong duality result:

$$\inf(A) = \max(B).$$

Under the familiar Slater regularity condition:

$$\text{(S)} \quad \exists\, x \in C : \ g(x) < 0$$

the strong duality relation is guaranteed. More precisely we have
(see e.g., Rockafellar (Refs. 15,16), Laurent (Ref. 17), Ponstein
(Ref. 18)):

Theorem 2.1 Assume that inf(A) < ∞ and that the regularity
assumption (S) holds; then

$$\inf(A) = \max(B).$$
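A one-dimensional toy instance may help fix the roles of the Lagrangian, the dual function h and condition (S). The sketch below uses the hypothetical problem of minimizing x^2 subject to 1 - x ≤ 0, which is not taken from the paper; for it the inner infimum can be written in closed form, and the dual is maximized on a simple grid.

```python
def f(x):
    """Objective of the toy problem (an assumed example)."""
    return x * x

def g(x):
    """Constraint function: g(x) <= 0 means x >= 1."""
    return 1.0 - x

def dual_function(y):
    """h(y) = inf_x { x^2 + y (1 - x) }; the unconstrained minimizer is x = y / 2."""
    x = y / 2.0
    return f(x) + y * g(x)

# Slater's condition (S) holds here: x = 2 gives g(2) = -1 < 0.
slater_point = 2.0

# Maximize the concave dual function over a grid of multipliers y in [0, 5].
ys = [i / 1000 for i in range(0, 5001)]
y_best = max(ys, key=dual_function)
dual_value = dual_function(y_best)

# The primal minimum of x^2 over x >= 1 is attained at x = 1.
primal_value = f(1.0)
```

Every grid value of h stays below the primal minimum (weak duality), and the maximum over y closes the gap, as Theorem 2.1 predicts when (S) holds.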


Remark 2.1 The regularity condition (S) is, in fact, related
to the notion of a stably set problem. More details are available
in Rockafellar (Ref. 15) and Laurent (Ref. 17) (especially Theorem
7.6.1, p. 401).

Remark 2.2 A result of the type of Theorem 2.1 typically has a
symmetric version, i.e., if (B) is assumed stably set then
min(A) = sup(B) (see Rockafellar (Ref. 15), Theorem 4, p. 179).

(*) We follow the convention of writing "min" ("max") if the infimum
(supremum) is attained.

We now return to the primal "entropy problem" (P). The
derivation of its dual objective function is based on the following
simple result.
The Lagrangian for problem (P) is $L : \mathbb{D} \times \mathbb{R}^m_+ \times \mathbb{R}^p_+ \to \mathbb{R}$,

$$L(x,y,\lambda) = b^t y - e^t \lambda + \int_T \Big\{ \log \frac{x(t)}{c_0(t)} - \sum_{i \in I} y_i a_i(t) + \sum_{k \in K} \lambda_k \log \frac{x(t)}{c_k(t)} \Big\}\, x(t)\, dt \tag{7}$$

and thus the dual problem (D) associated with (P) is defined as:

$$\sup \Big\{ \inf_{x \in \mathbb{D}} L(x,y,\lambda) \ : \ y \in \mathbb{R}^m_+, \ \lambda \in \mathbb{R}^p_+ \Big\}.$$

The next result shows that the dual problem (D) can be expressed simply
as a finite dimensional concave program involving only nonnegativity
constraints.


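Theorem 2.2 The dual problem of (P) is given by:

$$\text{(D)} \quad \sup_{y \in \mathbb{R}^m_+,\ \lambda \in \mathbb{R}^p_+} \Big\{ y^t b - \lambda^t e - \rho \log \int_T c_0(t)\, e^{\frac{1}{\rho} \{ \lambda^t B(t) + y^t A(t) \}}\, dt \Big\}$$

where:

$$\rho = 1 + \sum_{k=1}^p \lambda_k, \quad A(t) = (a_1(t), \dots, a_m(t))^t, \quad B(t) = (B_1(t), \dots, B_p(t))^t,$$

with

$$B_k(t) = \log \frac{c_k(t)}{c_0(t)} \quad \forall k \in K = \{1, \dots, p\}.$$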

Proof: The Lagrangian defined in (7) can be written after some
algebraic manipulations as:

$$L(x,y,\lambda) = y^t b - \lambda^t e + \int_T x(t) \log \frac{x(t)^{1 + \sum_{k=1}^p \lambda_k}}{c_0(t) \big( \prod_{k=1}^p c_k(t)^{\lambda_k} \big)\, e^{y^t A(t)}}\, dt.$$

Then, defining $\rho = 1 + \sum_{k=1}^p \lambda_k$ and $B_k(t) = \log \frac{c_k(t)}{c_0(t)}$, a little
algebra shows that the dual objective function can be expressed as:

$$h(y,\lambda) = y^t b - \lambda^t e + \rho \inf_{x \in \mathbb{D}} \int_T x(t) \log \frac{x(t)}{c_0(t)\, e^{\frac{1}{\rho} \{ \lambda^t B(t) + y^t A(t) \}}}\, dt.$$

Now, applying Lemma 2.1 with $s(t) = c_0(t)\, e^{\frac{1}{\rho} \{ \lambda^t B(t) + y^t A(t) \}}$ we get
the desired result.
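The statement of Lemma 2.1 is not reproduced in this text. Judging from how it is applied in the proof above and again in Theorem 2.3(b), it appears to be the identity that the infimum over densities x of the integral of x log(x/s) equals minus the log of the integral of s, attained at x = s divided by the integral of s; this reading is an assumption. The sketch below checks that identity on a finite support with arbitrary positive weights s and randomly drawn competitors.

```python
import math
import random

def entropy_functional(x, s):
    """sum_t x(t) log(x(t)/s(t)), with the convention 0 log 0 = 0."""
    return sum(xt * math.log(xt / st) for xt, st in zip(x, s) if xt > 0)

s = [0.5, 1.5, 0.25, 2.0]                 # assumed positive weights, not normalized
total = sum(s)
x_star = [st / total for st in s]         # candidate minimizer s / sum(s)
lower_bound = -math.log(total)            # conjectured optimal value

# Compare against randomly drawn probability vectors on the same support.
rng = random.Random(0)
trial_values = []
for _ in range(1000):
    raw = [rng.random() for _ in s]
    z = sum(raw)
    trial_values.append(entropy_functional([r / z for r in raw], s))
```

The candidate attains the bound exactly, and no random competitor falls below it, which is what the proof needs when s is taken to be the exponential expression above.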


Duality  results  for  the  pair  of  problems  (P) - (D)  will  now  follow 
by  setting  problem  (P)  as  a  convex  program  of  the  type  (A)  and  then 
applying  Theorem  2.1. 


Theorem 2.3

(a) If (P) is feasible then inf(P) is attained and

min(P) = sup(D).

Moreover, if there exists $x \in \mathbb{D}$ satisfying the constraints
(5), (6) strictly, then sup(D) is attained and

min(P) = max(D).

(b) If $x^* \in \mathbb{D}$ solves (P) and $y^* \in \mathbb{R}^m_+$, $\lambda^* \in \mathbb{R}^p_+$ solves (D) then:

$$x^*(t) = \frac{c_0(t)\, e^{\frac{1}{\rho^*} \{ \lambda^{*t} B(t) + y^{*t} A(t) \}}}{\int_T c_0(t)\, e^{\frac{1}{\rho^*} \{ \lambda^{*t} B(t) + y^{*t} A(t) \}}\, dt} \quad [\text{a.e.}]$$

where $\rho^* = 1 + \sum_{k=1}^p \lambda_k^*$.


Proof: In order to apply Theorem 2.1, we need to set problem (P) in
the format of the convex program (A). Thus, consider the linear
operator $A : L^1 \to \mathbb{R}^m$ given by:

$$Ax = \Big( \int_T a_1(t)\, x(t)\, dt, \ \dots, \ \int_T a_m(t)\, x(t)\, dt \Big)^t$$

and for $k \in \{0\} \cup K$, define the integral functionals

$$I_k(x) = \begin{cases} \int_T x(t) \log \dfrac{x(t)}{c_k(t)}\, dt & \text{if } x \in \mathbb{D} \\ \infty & \text{otherwise.} \end{cases}$$

Then problem (P) can be written as a convex optimization problem:

$$\text{(P)} \quad \inf \{ I_0(x) : Ax \ge b, \ I_k(x) \le e_k, \ k \in K, \ x \in \mathbb{D} \}.$$

Note that (P) corresponds to (A) with

$$X := L^1, \quad C := \mathbb{D}, \quad f(x) := I_0(x) \quad \text{and} \quad g(x) := \big( b - Ax, \ I_1(x) - e_1, \ \dots, \ I_p(x) - e_p \big)^t$$

and then the results follow from Theorem 2.1. In fact, since the
dual (D), given in Theorem 2.2, has only nonnegativity constraints
(y ≥ 0, λ ≥ 0), it satisfies the strongest constraint qualification,
implying by Remark 2.2 lack of duality gap and attainment of the primal
infimum. Thus the first part of conclusion (a) follows. The
second part follows directly from Theorem 2.1 itself. Moreover,
part (a) implies the existence of a saddle point
$(x^*(t), y^*, \lambda^*) \in \mathbb{D} \times \mathbb{R}^m_+ \times \mathbb{R}^p_+$, so

$$\min_{x \in \mathbb{D}} L(x, y^*, \lambda^*) = L(x^*, y^*, \lambda^*)$$

and the expression for x* given in (b) follows from the last part
of Lemma 2.1.

□
3. AN APPLICATION IN INFORMATION THEORY

Consider the error exponent problem

$$\text{(E)} \quad e(r) = \min_{q \in \mathbb{P}(r)} J(q, q_2)$$

where

$$\mathbb{P}(r) = \{ q \in \mathbb{P}^n : J(q, q_1) \le r \},$$

$J(q,q') = \sum_k q_k \log (q_k / q'_k)$ denotes the discrete relative entropy,
r is a given positive scalar and $q_1$, $q_2$ are given distributions in
$\mathbb{P}^n$. Problem (E) just defined is a special case of problem (P),
described in Section 2, with: I = ∅ (i.e., no linear constraints), K = {1}
and with $c_0(t)$, $c_1(t)$ corresponding here to the discrete finite
distributions $q_2$, $q_1$ respectively. Moreover, since problem (E)
consists of minimizing a continuous function over the compact set $\mathbb{P}(r)$,
the minimum is attained; we know also from Theorem 2.2 that the dual
problem (H) corresponding to (E) involves only nonnegativity constraints,
hence satisfying the strongest constraint qualification; we thus get the
following result, according to Theorem 2.2 and Theorem 2.3, by setting

$$\rho = 1 + \lambda_1 = 1 + \delta \quad \text{and} \quad e_1 = r.$$


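Theorem 3.1 A dual representation of (E) is the program

$$\text{(H)} \quad e(r) = \max_{\delta \ge 0} \Big\{ -\delta r - (1+\delta) \log \Big( \sum_k q_{1k}^{\frac{\delta}{1+\delta}}\, q_{2k}^{\frac{1}{1+\delta}} \Big) \Big\}.$$

Moreover, if $q^* \in \mathbb{P}^n$ solves (E) and $\delta^* \ge 0$ solves (H) then

$$q_k^* = \frac{q_{1k}^{\frac{\delta^*}{1+\delta^*}}\, q_{2k}^{\frac{1}{1+\delta^*}}}{\sum_k q_{1k}^{\frac{\delta^*}{1+\delta^*}}\, q_{2k}^{\frac{1}{1+\delta^*}}}.$$

We recover here a result obtained in (Ref. 14, Theorem 7).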
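The dual pair (E)-(H) lends itself to a direct numerical check. The sketch below assumes, as the formulas of Theorem 3.1 suggest, that (E) minimizes the relative entropy J(q, q2) subject to J(q, q1) ≤ r; the two distributions, the level r and the bisection solver are illustrative choices, not taken from the paper. It locates the multiplier at which the tilted distribution meets the constraint with equality and compares primal and dual values.

```python
import math

def rel_entropy(q, ref):
    """J(q, ref) = sum_k q_k log(q_k / ref_k), with 0 log 0 = 0."""
    return sum(a * math.log(a / b) for a, b in zip(q, ref) if a > 0)

def tilted(delta, q1, q2):
    """Normalized q1^(delta/(1+delta)) * q2^(1/(1+delta)), plus its normalizer."""
    w = [u ** (delta / (1 + delta)) * v ** (1 / (1 + delta)) for u, v in zip(q1, q2)]
    z = sum(w)
    return [t / z for t in w], z

def dual_objective(delta, r, q1, q2):
    """Objective of (H): -delta*r - (1+delta) log sum_k q1k^(..) q2k^(..)."""
    _, z = tilted(delta, q1, q2)
    return -delta * r - (1 + delta) * math.log(z)

def solve(r, q1, q2, iters=200):
    """Bisection on J(q_delta, q1) - r, which decreases as delta grows."""
    lo, hi = 0.0, 1.0
    while rel_entropy(tilted(hi, q1, q2)[0], q1) > r:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rel_entropy(tilted(mid, q1, q2)[0], q1) > r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

q1 = [0.7, 0.2, 0.1]        # assumed distribution in the constraint
q2 = [0.1, 0.3, 0.6]        # assumed distribution in the objective
r = 0.2                     # assumed level; needs 0 < r < J(q2, q1)

delta_star = solve(r, q1, q2)
q_star, _ = tilted(delta_star, q1, q2)
primal_value = rel_entropy(q_star, q2)
dual_value = dual_objective(delta_star, r, q1, q2)
```

With these inputs the constraint is active at the optimum, the primal and dual values agree, and the dual objective at other multipliers stays below that common value.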

We now derive the dual representation of E(R) by reference
to the error exponent function e(r).

Recalling the definition of the reliability rate function given in
the Introduction (see eq. (3)) and using our notations we have:

$$E(R) = \max_{p \in \mathbb{P}^n} \min_{Q \in \mathcal{Q}(R)} J(Q,P) \tag{10}$$

where

$$\mathcal{Q}(R) = \{ Q : I(p,Q) \le R \}.$$

A useful identity for the average mutual information is

$$I(p,Q) = \min_{q \in \mathbb{P}^n} J(Q,q) \tag{11}$$

where

$$J(Q,q) := \sum_{k=1}^m \sum_{j=1}^n p_j Q(k|j) \log \frac{Q(k|j)}{q_k};$$

this can be verified by observing that the minimum is achieved for

$$q_k = \sum_j p_j Q(k|j).$$
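Identity (11) can be illustrated numerically: J(Q,q) evaluated at the output distribution induced by p gives the mutual information, and any other q gives a larger value. The sketch below uses a hypothetical two-input, three-output channel and randomly drawn alternatives; none of these numbers come from the paper.

```python
import math
import random

def J(p, Q, q):
    """J(Q,q) = sum_k sum_j p_j Q(k|j) log(Q(k|j)/q_k); Q[j][k] stores Q(k|j)."""
    return sum(p[j] * Q[j][k] * math.log(Q[j][k] / q[k])
               for j in range(len(p)) for k in range(len(q))
               if p[j] * Q[j][k] > 0)

p = [0.3, 0.7]                                  # assumed input distribution
Q = [[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]          # assumed channel matrix

# Claimed minimizer in (11): the output distribution q_k = sum_j p_j Q(k|j).
output = [sum(p[j] * Q[j][k] for j in range(2)) for k in range(3)]
mutual_info = J(p, Q, output)

# Evaluate J at random alternative output distributions.
rng = random.Random(1)
others = []
for _ in range(500):
    raw = [rng.random() + 1e-9 for _ in range(3)]
    z = sum(raw)
    others.append(J(p, Q, [t / z for t in raw]))
```

The gap between J(Q,q) and its minimum is the relative entropy between the true output distribution and q, which is why no alternative can do better.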


Using (11), problem (10) becomes:

$$E(R) = \max_{p \in \mathbb{P}^n} \min_{Q} \Big\{ J(Q,P) : \min_{q \in \mathbb{P}^n} J(Q,q) \le R \Big\}. \tag{12}$$

Now it is an easy exercise to show that any optimization problem
of the form $\min_x \{ f(x) : \min_y g(x,y) \le r \}$ is equivalent to
$\min_{x,y} \{ f(x) : g(x,y) \le r \}$; hence (12) becomes

$$E(R) = \max_{p} \min_{q} \min_{Q} \{ J(Q,P) : J(Q,q) \le R \}. \tag{13}$$

The inner minimum in (13) is of the form of e(r) in problem (E),
and is appropriately denoted by e(R,q). Then by Theorem 3.1, a
dual representation of it is easily shown to be (writing $P_{kj} = P(k|j)$):

$$e(R,q) = \max_{\delta \ge 0} \Big\{ -\delta R - (1+\delta) \log \Big( \sum_k \sum_j p_j P_{kj}^{\frac{1}{1+\delta}}\, q_k^{\frac{\delta}{1+\delta}} \Big) \Big\} \tag{14}$$

Substituting the latter representation in (13), we get

$$E(R) = \max_{p} \min_{q} \max_{\delta \ge 0} \{ g(q,\delta) - \delta R \} \tag{15}$$

where

$$g(q,\delta) := -(1+\delta) \log \Big( \sum_k \sum_j p_j P_{kj}^{\frac{1}{1+\delta}}\, q_k^{\frac{\delta}{1+\delta}} \Big). \tag{16}$$

We shall prove that the "min-max" appearing in (15) can be reversed.
First, we need an auxiliary result.

Lemma 3.1 The function g(q,δ) defined in (16) is

(a) concave in δ for any $q \in \mathbb{P}^n$;

(b) convex in q for any δ ≥ 0.

Proof: (a) It is well known that the Lagrangian dual function is
always concave in the dual variables, hence (a) follows.

(b) Let $f : \mathbb{R} \to \mathbb{R}$ be a convex decreasing function, and let
$g : \mathbb{R}^n \to \mathbb{R}$ be a concave function; then it is easy to verify that
h(x) = f(g(x)) is convex.

Take f(t) = -log t (convex decreasing) and $g(q) = \sum_k a_k q_k^{\frac{\delta}{1+\delta}}$ with
$a_k := \sum_j p_j P_{kj}^{\frac{1}{1+\delta}} > 0$ (concave for δ ≥ 0); then clearly
g(q,δ) = (1+δ) f(g(q)) and (b) is proved.
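Both parts of Lemma 3.1 can be probed numerically with midpoint inequalities, which are necessary conditions for convexity and concavity. The sketch below evaluates g(q,δ), taken in the form g(q,δ) = -(1+δ) log Σ_k Σ_j p_j P(k|j)^{1/(1+δ)} q_k^{δ/(1+δ)}, for a hypothetical binary channel; the input distribution, the channel matrix and the sampling ranges are assumptions made for illustration, and passing the check is evidence rather than proof.

```python
import math
import random

p = [0.4, 0.6]                    # assumed input distribution
P = [[0.9, 0.1], [0.2, 0.8]]      # assumed channel; P[j][k] stores P(k|j)

def g(q, delta):
    """g(q,delta) = -(1+delta) log sum_k sum_j p_j P(k|j)^(1/(1+delta)) q_k^(delta/(1+delta))."""
    s = sum(p[j] * P[j][k] ** (1 / (1 + delta)) * q[k] ** (delta / (1 + delta))
            for j in range(2) for k in range(2))
    return -(1 + delta) * math.log(s)

rng = random.Random(2)
convex_ok, concave_ok = True, True
for _ in range(1000):
    u, v = rng.uniform(0.01, 0.99), rng.uniform(0.01, 0.99)
    d1, d2 = rng.uniform(0, 10), rng.uniform(0, 10)
    qa, qb = [u, 1 - u], [v, 1 - v]
    qm = [(u + v) / 2, 1 - (u + v) / 2]
    # Midpoint convexity in q at fixed delta (part (b)).
    convex_ok &= g(qm, d1) <= 0.5 * (g(qa, d1) + g(qb, d1)) + 1e-12
    # Midpoint concavity in delta at fixed q (part (a)).
    concave_ok &= g(qa, 0.5 * (d1 + d2)) >= 0.5 * (g(qa, d1) + g(qa, d2)) - 1e-12
```

The same function also satisfies g(q,0) = 0 for every q, a fact used later in the proof of the min-max theorem.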


The min-max theorem related to (15) now follows.

Theorem 3.2 Let K(q,δ) = g(q,δ) - δR. Then

$$\min_{q} \max_{\delta \ge 0} K(q,\delta) = \max_{\delta \ge 0} \min_{q} K(q,\delta) \tag{17}$$

Proof: By Lemma 3.1, K(q,δ) is a convex-concave saddle function
for every $q \in \mathbb{P}^n$ and every δ ≥ 0. By a result of Rockafellar
(Ref. 19), a sufficient condition for the validity of (17) for a
general convex-concave saddle function is:

$$\exists\, \delta_0 \ge 0 \ \text{such that} \ \frac{\partial K}{\partial \delta}(q,\delta) \le 0 \quad (q \in \mathbb{P}^n, \ \delta \ge \delta_0).$$

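This is certainly satisfied if:

$$\exists\, q, \ \exists\, \delta > 0 \ \text{such that} \ \frac{\partial K}{\partial \delta}(q,\delta) < 0,$$

i.e.,

$$\exists\, q, \ \exists\, \delta > 0 : \quad g'(q,\delta) = \frac{\partial}{\partial \delta} g(q,\delta) < R. \tag{18}$$

Since R > 0, it suffices to prove that:

$$\inf_{\delta \ge 0} g'(q,\delta) \le 0. \tag{19}$$

But g'(q,δ) is the derivative of a concave function and thus is
decreasing, hence

$$\inf_{\delta \ge 0} g'(q,\delta) = \lim_{\delta \to \infty} g'(q,\delta). \tag{20}$$

Moreover, the gradient inequality for the concave function g(q,·)
implies:

$$0 = g(q,0) \le g(q,\delta) - \delta\, g'(q,\delta)$$

hence:

$$g'(q,\delta) \le \frac{g(q,\delta)}{\delta}.$$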
The last theorem permits us to write E(R) (see eq. (15)) as:

$$E(R) = \max_{p} \max_{\delta \ge 0} \min_{q} K(q,\delta).$$

However, the next result will show that the inner minimum can be
computed, and thus E(R) can be expressed simply as a double maximum
problem.


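Lemma 3.2

$$\max_{x \in X} \log \sum_k x_k^{\frac{\alpha}{1+\alpha}}\, y_k = \frac{1}{1+\alpha} \log \sum_k y_k^{1+\alpha} \quad (\alpha > 0)$$

where

$$X = \Big\{ x \in \mathbb{R}^n : x \ge 0, \ \sum_{k=1}^n x_k = 1 \Big\}.$$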

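Proof: From Hölder's inequality we get

$$\sum_k x_k^{\frac{\alpha}{1+\alpha}}\, y_k \le \Big( \sum_k x_k \Big)^{\frac{\alpha}{1+\alpha}} \Big( \sum_k y_k^{1+\alpha} \Big)^{\frac{1}{1+\alpha}}.$$

Taking log of both expressions and using the fact that $\sum_k x_k = 1$,
we get:

$$\sup_{x \in X} \log \sum_k x_k^{\frac{\alpha}{1+\alpha}}\, y_k \le \frac{1}{1+\alpha} \log \sum_k y_k^{1+\alpha}$$

and the sup is attained for $x_k = y_k^{1+\alpha} / \sum_k y_k^{1+\alpha}$.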
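A quick numerical check of Lemma 3.2 follows. The vector y (assumed positive here), the exponent α = 1.5 and the random test points on the simplex are illustrative assumptions; the sketch confirms that the claimed maximizer attains the closed-form value and that randomly drawn points do not exceed it.

```python
import math
import random

def objective(x, y, alpha):
    """log sum_k x_k^(alpha/(1+alpha)) * y_k."""
    return math.log(sum(xk ** (alpha / (1 + alpha)) * yk for xk, yk in zip(x, y)))

y = [0.2, 1.3, 0.7]           # assumed positive weights
alpha = 1.5                   # assumed exponent, alpha > 0

norm = sum(yk ** (1 + alpha) for yk in y)
x_star = [yk ** (1 + alpha) / norm for yk in y]       # claimed maximizer
closed_form = math.log(norm) / (1 + alpha)            # claimed maximum value

# Evaluate the objective at random points of the simplex X.
rng = random.Random(3)
samples = []
for _ in range(1000):
    raw = [rng.random() for _ in y]
    z = sum(raw)
    samples.append(objective([t / z for t in raw], y, alpha))
```

Equality in Hölder's inequality holds exactly when x_k is proportional to y_k^{1+α}, which is why x_star reproduces the closed-form value.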


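Now, since

$$\min_{q} K(q,\delta) = -\delta R - \max_{q} \{ -g(q,\delta) \},$$

using Lemma 3.2 with $x_k := q_k$, $\alpha := \delta$ and $y_k := \sum_j p_j P_{kj}^{\frac{1}{1+\delta}}$, a final expression
for the reliability rate function E(R) is:

$$E(R) = \max_{p \in \mathbb{P}^n} \max_{\delta \ge 0} \Big\{ -\delta R - \log \sum_k \Big( \sum_j p_j P_{kj}^{\frac{1}{1+\delta}} \Big)^{1+\delta} \Big\} \tag{21}$$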


This result coincides with Theorem 18 given in (Ref. 14).
The second term in (21) is the so-called Gallager function.
The dual representation (21) is useful for deriving efficient
computational algorithms, see e.g., (Ref. 20).
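As a rough illustration of how (21) can be evaluated, the sketch below computes the inner maximization over δ on a grid for a binary symmetric channel. It is not the algorithm of Ref. 20. The crossover probability, the truncation of δ to [0, 20], the grid resolution and the choice to fix p at the uniform distribution (so the outer maximization over p is not carried out) are all assumptions made for illustration.

```python
import math

def gallager_term(delta, p, P):
    """-log sum_k ( sum_j p_j P(k|j)^(1/(1+delta)) )^(1+delta); P[j][k] stores P(k|j)."""
    total = 0.0
    for k in range(len(P[0])):
        inner = sum(p[j] * P[j][k] ** (1 / (1 + delta)) for j in range(len(p)))
        total += inner ** (1 + delta)
    return -math.log(total)

def exponent_for_input(R, p, P, delta_max=20.0, steps=20000):
    """Grid approximation of max over delta in [0, delta_max] of -delta*R + gallager_term."""
    return max(-d * R + gallager_term(d, p, P)
               for d in (delta_max * i / steps for i in range(steps + 1)))

eps = 0.1                                        # assumed crossover probability
bsc = [[1 - eps, eps], [eps, 1 - eps]]
uniform = [0.5, 0.5]                             # fixed input distribution (assumption)

# Capacity of this channel in nats, used only to pick rates below and above it.
capacity = math.log(2) + eps * math.log(eps) + (1 - eps) * math.log(1 - eps)

below = exponent_for_input(0.5 * capacity, uniform, bsc)
above = exponent_for_input(1.2 * capacity, uniform, bsc)
```

In this example the exponent is strictly positive at a rate below capacity and vanishes at a rate above it, where the maximum over δ is attained at δ = 0.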